108 results found.
Written
Corpus,
Language Type:
Multilingual
Languages:
Arabic Chinese English German Hindi Spanish Vietnamese
Availability:
Freely Available
License:
Size:
50+ GByte
Production Status:
Existing-used
Use:
Machine Learning
-
Paper title:MLQA: Evaluating Cross-lingual Extractive Question Answering
-
Paper track:Long/Question Answering
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Patrick Lewis | Wikipedia | /N |
Documentation:
None
Speech/Written
Corpus,
Language Type:
Multilingual
Languages:
Catalan Chinese English Esperanto French German Italian Kabyle Kinyarwanda Persian Polish Russian Spanish Welsh
Availability:
Freely Available
License:
Creative Commons license
Size:
8.8k hours
Production Status:
Existing-used
Use:
Speech Recognition/Understanding
-
Paper title:LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
-
Paper track:8.1 Feature extraction and low-level feature model/Oral Presentation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Laurent Besacier | Common Voice | /N |
Documentation:
https://arxiv.org/pdf/1912.06670.pdf, English, public
Language Type:
Multilingual
Languages:
Arabic Chinese English Finnish Hindi
Availability:
Freely Available
License:
BSD 3
Size:
<Not Specified> <Not Specified>
Production Status:
Existing-updated
Use:
Corpus Creation/Annotation
-
Paper title:Gold Standard Annotations for Preposition and Verb Sense with Semantic Role Labels in Adult-Child Interactions
-
Paper track:Resource paper
-
Paper status:Accept Poster
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Author 1 | Lori Moon | University of Illinois at Urbana-Champaign | None |
| Author 2 | Christos Christodoulopoulos | Amazon | GB |
| Author 3 | Fisher Cynthia | University of Illinois | US |
| Author 4 | Sandra Franco | University of Illinois at Urbana-Champaign | N/A |
| Author 5 | Dan Roth | University of Illinois | US |
| Main Contact | Lori Moon | University of Illinois at Urbana-Champaign | None |
Documentation:
https://www.colorado.edu/ics/sites/default/files/attached-files/techreport02-09-jubilee.pdf
Written
Corpus,
Language Type:
Bilingual
Languages:
Chinese English
Availability:
Freely Available
License:
Size:
500 MByte
Production Status:
Existing-used
Use:
Dialogue
-
Paper title:One Time of Interaction May Not Be Enough: Go Deep with an Interaction-over-Interaction Network for Response Selection in Dialogues
-
Paper track:Long/Dialogue and Interactive Systems
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Chongyang Tao | Douban Conversation Corpus | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
From Owner
License:
Gnu
Size:
2.69 GByte
Production Status:
Newly created-finished
Use:
Emotion Recognition/Generation
-
Paper title:Coherent Comments Generation for Chinese Articles with a Graph-to-Sequence Model
-
Paper track:Long/Generation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Wei Li | Chinese Article Comment | /N |
Documentation:
None
Written
Corpus,
Language Type:
Multilingual
Languages:
Ancient Greek Arabic Chinese English Finnish Hebrew Korean Russian Swedish
Availability:
Freely Available
License:
CreativeCommons, Gnu
Size:
11814230 tokens
Production Status:
Existing-used
Use:
Parsing and Tagging
-
Paper title:The (Non-)Utility of Structural Features in BiLSTM-based Dependency Parsers
-
Paper track:Long/Tagging, Chunking, Syntax and Parsing
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Agnieszka Falenska | Universal Dependencies 2.0 | /N |
Documentation:
https://universaldependencies.org/v2/
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
OpenSource
Size:
340 MByte
Production Status:
Newly created-finished
Use:
Question Answering
-
Paper title:ChID: A Large-scale Chinese IDiom Dataset for Cloze Test
-
Paper track:Long/Resources and Evaluation
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | none nan | ChID dataset | /N |
Documentation:
None
Written
Tokenizer,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
The MIT License (MIT)
Size:
None
Production Status:
Existing-used
Use:
Dialogue
-
Paper title:Learning to Abstract for Memory-augmented Conversational Response Generation
-
Paper track:Long/Dialogue and Interactive Systems
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Zhiliang Tian | Jieba | /N |
Documentation:
Yes, in Chinese, publicly available.
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
Size:
270k sentences
Production Status:
Newly created-finished
Use:
Dialogue
-
Paper title:Proactive Human-Machine Conversation with Explicit Conversation Goal
-
Paper track:Long/Dialogue and Interactive Systems
-
Paper status:Accept
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Xiangyang Zhou | DuConv | /N |
Documentation:
None
Written
Corpus,
Language Type:
Monolingual
Languages:
Chinese
Availability:
Freely Available
License:
GNU General Public License
Size:
2.9 MByte
Production Status:
Newly created-finished
Use:
Document Classification, Text categorisation
-
Paper title:DT-QDC: A Dataset for Question Comprehension in Online Test
-
Paper track:Long paper/
-
Paper status:Accept Oral
| Author Number | Name | Affiliation | Country |
|---|---|---|---|
| Main Contact | Sijin Wu | DT-QDC dataset | /N |
Documentation:
https://github.com/wusj18/DT-QDC




